DOCUMENT RESUME 



ED 315 451                                              TM 014 47P

AUTHOR          Daniel, Larry G.
TITLE           Use of Structure Coefficients in Multivariate
                Educational Research: A Heuristic Example.
PUB DATE        Jan 90
NOTE            24p.; Paper presented at the Annual Meeting of the
                Southwest Educational Research Association (Austin,
                TX, January 25-27, 1990).
PUB TYPE        Reports - Evaluative/Feasibility (142) --
                Speeches/Conference Papers (150)
EDRS PRICE      MF01/PC01 Plus Postage.
DESCRIPTORS     *Analysis of Variance; *Educational Research;
                *Heuristics; *Multivariate Analysis; *Predictor
                Variables
IDENTIFIERS     Collinearity; *Linear Discriminant Function; *Type I
                Errors



ABSTRACT 

A small multivariate data set is used to illustrate 
the usefulness of structure coefficients when interpreting results of 
educational experiments. Data are analyzed using a multivariate 
analysis of variance (MANOVA), and results are interpreted in three
different ways to determine the contribution of individual variables 
to prediction: (1) using multiple analyses of variance following a 
statistically significant MANOVA; (2) using standardized linear 
discriminant function coefficients; and (3) using structure 
coefficients. The use of structure coefficients is shown to be 
superior to the other methods as structure coefficients honor the 
multivariate reality of the data, minimize experiment-wise Type I 
error rates, and are not inflated or suppressed by collinearity among 
variables. Five data tables are included. (Author/TJH) 

ABSTRACT

A small multivariate data set is utilized to illustrate the 
usefulness of structure coefficients when interpreting results of 
educational experiments. Data are analyzed using a multivariate 
analysis of variance (MANOVA), and results are interpreted in
three different ways to determine the contribution of individual
variables to prediction: (a) using multiple ANOVAs following a
statistically significant MANOVA, (b) using standardized linear 
discriminant function coefficients, and (c) using structure 
coefficients. The use of structure coefficients is shown to be 
superior to these other methods as structure coefficients 
appropriately honor the multivariate reality of the data, 
minimize experiment-wise Type I error rates, and are not inflated 
or suppressed by collinearity among variables. 



Use of Structure Coefficients in Multivariate Educational
Research: A Heuristic Example

Multivariate statistical methods (i.e., methods employing 
data sets in which the n of dependent variables ≥ 2) are
desirable in that they not only reduce the risk of high 
experimentwise Type I error rates associated with studies 
employing multiple univariate tests, but they also tend to honor 
the reality of relationships among the variables under study 
(Fish, 1988). Although the advent of the computer and numerous 
"user-friendly" statistical packages have made these 
mathematically-complex multivariate methods available to even the 
most non-mathematically oriented researchers (Haase & Ellis, 
1987; McMillan & Schumacher, 1984), these techniques still 
account for only a small percentage of statistical techniques 
used in various educational and psychological research journals 
(Elmore & Woehlke, 1988; Goodwin & Goodwin, 1985a, 1985b; 
Willson, 1980).

Goodwin and Goodwin (1985b) suggest two possible reasons for 
the absence of use of these more advanced statistical methods: 
(a) that the majority of research questions of importance to 
educational and psychological researchers are appropriately 
addressed using less sophisticated univariate or descriptive 
techniques, and (b) that numerous researchers are unfamiliar with 
these methods and therefore are less likely to use them. 
Although less advanced descriptive or univariate statistics are 
appropriate for certain research situations, many (e.g., Fish, 
1980; Hopkins, 1980; Kerlinger, 1986; Thompson, 1986) have argued 
convincingly that behavioral research is generally characterized
by a complex set of highly interrelated variables, and that
multivariate methods best honor the relationships among
variables. Consequently, as Kerlinger (1979, p. 208) has noted,
"one cannot understand contemporary behavioral research without a
fairly good understanding of multivariate approaches and
methods."

Considering the wealth of scholars who support the 
appropriateness of using multivariate statistical methods in most 
behavioral research situations, it would follow that Goodwin and 
Goodwin's (1985b) second reason for the absence of use of these 
methods (i.e., that researchers are unfamiliar with these 
methods) is in many cases the more likely one. Furthermore, even 
if a researcher has a precursory knowledge of a particular 
multivariate method, he or she may be hesitant to employ the 
method knowing that multivariate results are often difficult to 
interpret, especially when there is a high degree of correlation 
among the several variables in the dependent variable set. 

Bray and Maxwell (1982), Haase and Ellis (1987), Huberty and 
Morris (1989), and Share (1984) reviewed several statistical 
techniques useful in dealing with this problem of multivariate
"collinearity." Their discussions are particularly appropriate to
research situations characterized by a high degree of
collinearity among outcome variables, and involving the initial
application of multivariate analyses of variance (MANOVAs). For
instance, in a research situation employing a MANOVA with three
predictor and two highly-interrelated criterion variables, the
researcher may be unsure how to interpret results even if results 
are statistically significant and of a notable effect size. 
According to Huberty and Morris (1989, p. 304), at least three 
interpretation problems can arise from this research situation: 
(a) the "variable selection problem," i.e., the determination of 
which of the several variables account for categorical 
differences among subjects; (b) the "variable ordering problem," 
i.e., the determination of the relative contribution of each 
outcome variable to resultant group differences; and (c) the 
problem of interpreting underlying constructs that can be 
identified in the variable system structure on the basis of 
MANOVA results. The third of these interpretation problems 
(interpretation of underlying constructs) will be addressed here. 

A number of techniques for dealing with identification of 
constructs underlying variables in multivariate research have 
been suggested. These techniques include (but are not limited 
to) (a) following up a statistically significant MANOVA with 
multiple univariate analyses of variance (ANOVA) tests to
determine the effect of the variables in the predictor set on 
each of the outcome variables, (b) interpreting linear 
composites (i.e., standardized linear discriminant function 
coefficients) of outcome variables to determine which variables 
contribute to underlying constructs identified in the study, and 
(c) interpreting structure coefficients (correlations between 
each outcome variable and the linear discriminant function). A
brief review of the literature relative to the use of each of
these three techniques follows.

MANOVA Followed by Multiple ANOVAs

When a multivariate analysis of variance (MANOVA) yields 
statistically significant results, many researchers routinely 
follow up the MANOVA with multiple ANOVAs in an attempt to 
determine which outcome variables account for the majority of 
differences across the independent variables. One serious 
problem with this approach to interpreting MANOVA results is the 
potential for escalation of the experimentwise Type I error rate 
(Bray & Maxwell, 1982; Huberty & Morris, 1989; Share, 1984). As
Huberty and Morris (1989, p. 306) have noted: 

Whenever multiple statistical tests are carried out 
in inferential data analysis, there is a potential 
problem of "probability pyramiding." Use of 
conventional levels of Type I error probabilities 
[e.g., 1%, 5%, 10%] for each test in a series of 
statistical tests may yield an unacceptably high 
Type I error probability across all of the tests 
(the "experimentwise error rate"). 

Thompson (1986) also addresses the problem with escalation
of experimentwise error rates when multiple tests are used:

. . .the experimentwise error rate is a function of
the degree of correlation between the variables
being studied, and of the number of statistical
significance tests conducted based on data from the
same subjects. The experimentwise Type I error
rate will be at least equal to the alpha level
choosen [sic] for each individual test. . . . [T]he
experimentwise Type I error rate may be as high as
1 - (1 - alpha) raised to the k power, where k is
the number of statistical tests conducted. For
example, if 20 t-tests using an alpha of .05 are
conducted based on data from the same subjects, the
experimentwise error rate will range somewhere
between 5% and. . .64.2%. (p. 6)

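The range Thompson describes can be checked numerically. The following is a minimal sketch; the function name `experimentwise_bounds` is a label chosen here for illustration and does not come from the cited sources. It returns the testwise alpha as the lower bound and 1 - (1 - alpha)^k as the upper bound.

```python
def experimentwise_bounds(alpha, k):
    """Return (lower, upper) bounds on the experimentwise Type I error rate.

    lower: the testwise alpha itself (reached when the k tests are
           perfectly correlated).
    upper: 1 - (1 - alpha)**k (reached when the k tests are independent).
    """
    if not 0.0 < alpha < 1.0 or k < 1:
        raise ValueError("alpha must be in (0, 1) and k must be >= 1")
    return alpha, 1.0 - (1.0 - alpha) ** k


# Thompson's example: 20 t-tests, each at alpha = .05
lower, upper = experimentwise_bounds(alpha=0.05, k=20)
print(f"{lower:.1%} to {upper:.1%}")  # 5.0% to 64.2%
```

Where the actual rate falls within this range depends on how strongly the tests are correlated, which is why the quotation gives a range rather than a single value.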
A second problem associated with the use of multiple ANOVAs 
following a statistically significant MANOVA is that the two 
analyses address very different research questions (Huberty & 
Morris, 1989; Share, 1984). Univariate procedures fail to honor 
the reality of the linear combinations of the several outcome 
variables being studied in a multivariate research situation, 
and, in essence, the reality of the behaviors represented by the 
variables. As Haase and Ellis (1987, p. 405) note, "univariate 
test statistics. . .are based on the assumption that the 
correlations among the dependent variables are zero." Hence, in 
discriminant analysis where the goal is to identify which 
underlying constructs best account for group differences, "it is 
unlikely to be the case that the major differences lie solely in 
single variables, but rather in combinations of variables such as
subsets, or differences between subsets" (Share, 1984, p. 352,
emphasis in original).
In the words of Thompson (1986, pp. 9-10), "Only methods
which simultaneously consider the full network of variable 
relationships honor a reality in which the full network of 
variables may operate simultaneously on each other" (emphasis 
added). Similarly, Eason and Daniel (1989, p. 1) note that the
"use of statistical techniques which do not honor the true 
relationships among the variables under study may cause the 
researcher to draw inaccurate conclusions about causality or 
correlation among variables." Haase and Ellis (1987, p. 405) 
provide an interesting example which illustrates the advantage of 
maintaining the multivariate reality of experimental variables: 

Height and weight, for example, may be analyzed
independently (univariately), and this analysis may
yield conclusions about height and weight. An
analysis of the optimal linear combination of
height and weight (multivariate), however, would
probably be interpreted as an analysis of the
concept size. Such truly multivariate modeling
simply cannot be addressed by separate univariate
analyses. (emphasis in original)

Interestingly, these two arguments against interpreting 
MANOVA results by consulting multiple univariate ANOVA F tests 
also serve as good arguments for the use of multivariate methods 
in research situations in which multiple outcome variables are 
inherently related. [See Fish (1988) and Thompson (1986) for 
understandable treatises on the importance of using multivariate 
methods in behavioral research.] Consequently, considering the
problems associated with following up MANOVA with ANOVAs, and the
inability of univariate methods to adequately address behavioral
reality, Huberty and Morris (1989, p. 302) conclude that this
approach to interpreting MANOVA results is "seldom, if ever,
appropriate."

Interpreting Discriminant Function Coefficients

A second method for interpreting effects in multivariate
analysis of variance is to follow up the MANOVA with discriminant
analysis. Some researchers (e.g., McQuarrie & Grotelueschen,
1971) use the resulting standardized discriminant function 
coefficients to determine outcome variable contributions to the 
identification of underlying constructs. Although this method 
does consider the multivariate relationships among the variables 
under study, Bray and Maxwell (1982) and Huberty and Smith (1982) 
caution that the use of these coefficients when outcome variables 
are highly intercorrelated may lead to erroneous conclusions 
about the contributions of a given variable. As Bray and
Maxwell (1982) have noted, "Discriminant functions can change
drastically with the addition or deletion of one or more
variables" (p. 345).

Huberty and Morris (1989, p. 304) concur, noting a 
particular problem with the replicability of MANOVA results when 
interpreting discriminant function coefficients: 

What a good variable subset or a relatively good 
individual variable is depends upon the collection 
of the variables in the system being studied. How
well the proposed selection and ordering results 
hold up over repeated sampling needs to be 
addressed with further empirical study. Of course, 
replication is highly desirable. The rank-order 
position of a given variable in a system of 
variables may change when new variables are added 
to the system. . . .Hence, a conclusion regarding 
the goodness of a variable subset and the relative 
goodness of individual variables must be made with
some caution. (emphasis added)

Interpreting Structure Coefficients

A third method for interpreting MANOVA results is to consult 
structure coefficients in addition to or instead of function
coefficients. Structure coefficients (or canonical variate
correlations) express correlations between each outcome variable
and the linear composite of all the outcome variables (i.e., the
"synthetic" or "canonical" variate). Since structure
coefficients are not affected by variable collinearity, it is
proposed that structure coefficients produce more stable 
estimates of variable contributions than do function 
coefficients. As noted by Haase and Ellis (1987, p. 411), 
discriminant function and structure coefficients offer different 
types of information about the relationship of variables in a 
given study: 

The discriminant function coefficients reflect the
unique contribution of any dependent variable over
and above that of the remaining dependent
variables. The structure coefficient reflects the
total contribution of any dependent variable to the
linear composite without taking into consideration
its relation to or redundancy with the other
dependent variables. In this sense, the structure
coefficients are akin to factor loadings.
(emphasis added)

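To make the distinction between the two kinds of coefficients concrete, the sketch below computes both for a one-way design. It is a minimal illustration under stated assumptions, not the SPSSx routine used later in the paper: the function name `discriminant_coefficients` is hypothetical, the data are synthetic (generated only to show a collinear pair of outcomes), and structure coefficients are taken to be pooled within-group correlations between each outcome and each discriminant function, which is one common definition. The sign of each function is arbitrary.

```python
import numpy as np


def discriminant_coefficients(X, groups):
    """Return (standardized function coefficients, structure coefficients).

    Rows are outcome variables; columns are discriminant functions.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, p = X.shape
    grand_mean = X.mean(axis=0)

    W = np.zeros((p, p))  # pooled within-group SSCP matrix
    B = np.zeros((p, p))  # between-group SSCP matrix
    for g in labels:
        Xg = X[groups == g]
        dev = Xg - Xg.mean(axis=0)
        W += dev.T @ dev
        d = (Xg.mean(axis=0) - grand_mean)[:, None]
        B += len(Xg) * (d @ d.T)

    # Solve B v = lambda W v through a Cholesky factor of W (W must be
    # positive definite), which keeps the eigenproblem symmetric.
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    eigvals, U = np.linalg.eigh(L_inv @ B @ L_inv.T)
    n_funcs = min(p, len(labels) - 1)
    order = np.argsort(eigvals)[::-1][:n_funcs]

    df_err = n - len(labels)
    S_w = W / df_err                               # pooled within-group covariance
    V = (L_inv.T @ U[:, order]) * np.sqrt(df_err)  # raw weights, V' S_w V = I
    sd = np.sqrt(np.diag(S_w))

    standardized = sd[:, None] * V        # weights for z-scored outcomes
    structure = (S_w @ V) / sd[:, None]   # within-group corr(outcome, function)
    return standardized, structure


# Synthetic illustration only: y2 is nearly a copy of y1 (strong collinearity).
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], 12)
y1 = rng.normal(size=36) + 0.8 * groups
y2 = y1 + rng.normal(scale=0.3, size=36)
y3 = rng.normal(size=36)
standardized, structure = discriminant_coefficients(
    np.column_stack([y1, y2, y3]), groups
)
print(np.round(standardized, 3))
print(np.round(structure, 3))
```

Comparing the two printed matrices for the collinear pair shows the kind of contrast Haase and Ellis describe: the function coefficients divide credit between redundant variables, while each structure coefficient reports that variable's own correlation with the composite.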
A number of researchers (e.g., Huberty, 1975, 1984; Huberty 
& Morris, 1989; Kerlinger & Pedhazur, 1973; Meredith, 1964;
Spector, 1977; Thompson & Borrello, 1985) have recognized the 
usefulness of structure coefficients in interpreting results of 
educational experiments. Thompson and Borrello (1985) provided a 
demonstration of the superiority of structure coefficients over 
regression beta weights in a univariate (one dependent variable) 
research situation involving a high degree of collinearity among 
predictor variables. Similarly, Huberty and Morris (1989) 
demonstrated the superiority of structure coefficients over 
linear discriminant function coefficients in the multivariate
case.

Interestingly, however, not all researchers and 
statisticians agree that structure coefficients are necessarily 
superior to discriminant function coefficients. For instance, 
Haase and Ellis (1987, p. 411) note: 

When structure coefficients were first proposed, 
there was some expectation that they would be more
stable indexes than the discriminant function 
coefficients; however, the existing empirical 
evidence has neither confirmed nor disconfirmed 
this expectation. 
In a review of several studies which have compared the stability 
of the two types of coefficients in cross-validation 
applications, Bray and Maxwell (1982) noted that these studies
have produced mixed results. However, Bray and Maxwell note that 
in several of the studies the superiority of structure 
coefficients is closely linked to the degree of correlation among 
the variables under study, suggesting that when variables are 
highly correlated (as often they are in multivariate behavioral 
research), structure coefficients may be the better coefficients
to use in interpreting research results. 

It is important to note that it is often valuable to consult 
both sets of coefficients in a given analysis (Bray & Maxwell, 
1982; Thompson & Borrello, 1985); however, as Bray and Maxwell
(1982) and Thorndike (1978) conclude, structure coefficients may 
be the more important coefficient to use in interpreting the 
substantive nature of the synthetic variable composite as 
structure coefficients better honor the reality of the 
relationships among the variables under study. Consequently, 
Thompson (1988, p. 18) asserts: 

In an artificial forced-choice world in which only 
one coefficient could be consulted, structure 
coefficients might be preeminent; in the real world
both coefficients should be consulted in 
interpretation. Interpretations based solely on 
function coefficients should be eschewed. 

A Heuristic Example 

In an effort to investigate the relative merit of the three 
aforementioned approaches to interpreting MANOVA results, the 
present study employed a small hypothetical multivariate data 
set. For the sake of simplicity, a one-way design was used, with 
experimental condition serving as the three-level predictor 
variable. Three continuous criterion variables (scores on three 
subtests in an achievement battery) were specified. Data were 
analyzed for 36 subjects. These data are presented in the first 
five columns of Table 1. Following the MANOVA the three 
interpretive procedures were employed. 

The multivariate analysis of variance was conducted using 
the SPSSx MANOVA procedure. The results of this analysis are 
presented in Table 2. The analysis yielded a statistically 
significant (p < .01) multivariate F of 4.03. Wilks' lambda for
the analysis was .5176, indicating an effect size of 
approximately 48%. The results of the three follow-up ANOVAs 
are presented in Table 3. 
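The effect size and the approximate F reported above can both be recovered from Wilks' lambda. The sketch below assumes Rao's F approximation, a standard conversion that the text does not name, so the choice of formula and the function name `rao_f_from_wilks` are assumptions. The design values follow the description above: three outcome variables, three groups (2 hypothesis degrees of freedom), and 36 subjects (33 error degrees of freedom).

```python
import math


def rao_f_from_wilks(wilks, p, df_hyp, df_err):
    """Rao's F approximation for Wilks' lambda: returns (F, df1, df2)."""
    pq = p * df_hyp
    denom = p**2 + df_hyp**2 - 5
    # s is set to 1 when the usual expression is degenerate (denom <= 0)
    s = math.sqrt((pq**2 - 4) / denom) if denom > 0 else 1.0
    m = df_err - (p - df_hyp + 1) / 2
    df1 = pq
    df2 = m * s - pq / 2 + 1
    root = wilks ** (1 / s)
    return (1 - root) / root * df2 / df1, df1, df2


wilks = 0.51762  # Wilks' lambda as reported in Table 2
f_value, df1, df2 = rao_f_from_wilks(wilks, p=3, df_hyp=2, df_err=33)
print(f"effect size = {1 - wilks:.4f}")        # effect size = 0.4824
print(f"F({df1}, {df2:.0f}) = {f_value:.2f}")  # F(6, 62) = 4.03
```

The recovered degrees of freedom (6 and 62) match the Wilks row of Table 2, which suggests the reported F was obtained with this or an equivalent approximation.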



INSERT TABLES 1, 2, AND 3 ABOUT HERE

Only the ANOVA for dependent variable SCORE3 yielded a
statistically significant (p < .001) F of 11.81 with an effect
size of 41.71%. In addition to being statistically
nonsignificant, the results of the remaining two analyses are
also far from noteworthy, with effect sizes of only 4.00% and
8.92%. These results indicate that the SCORE3 variable
contributed most heavily to the differences in subjects across
the levels of the independent variable. However, as previously
noted, the "multiple ANOVAs" interpretive approach fails to
address the "linear combination" issue when determining variable
contributions. In addition, considering that a total of four
significance tests were conducted using the same data set, the
experimentwise alpha for these analyses [1 - (1 - alpha)^k]
using a testwise alpha of .05 may be as high as approximately
18.55%, well above the nominal 5% chance that a statistically
significant result occurred by chance.
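The univariate figures in this paragraph follow from simple arithmetic on Table 3, sketched below. The sums of squares are copied from Table 3. Treating the reported effect size as eta squared (hypothesis SS divided by total SS) is an inference from the reported values rather than something the text states.

```python
# (hypothesis SS, error SS) copied from Table 3; df = 2 and 33
anova_ss = {
    "SCORE1": (6.05556, 61.83333),
    "SCORE2": (4.66667, 112.08333),
    "SCORE3": (19.50000, 27.25000),
}
DF_HYP, DF_ERR = 2, 33

results = {}
for name, (ss_hyp, ss_err) in anova_ss.items():
    f_ratio = (ss_hyp / DF_HYP) / (ss_err / DF_ERR)
    eta_sq = ss_hyp / (ss_hyp + ss_err)  # assumed effect size: eta squared
    results[name] = (f_ratio, eta_sq)
    print(f"{name}: F = {f_ratio:.2f}, effect size = {eta_sq:.2%}")

# Upper bound on experimentwise alpha for 4 tests (1 MANOVA + 3 ANOVAs)
alpha, k = 0.05, 4
experimentwise = 1 - (1 - alpha) ** k
print(f"experimentwise alpha <= {experimentwise:.2%}")  # 18.55%
```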

A discriminant analysis of the data yielded two discriminant 
functions, which may be interpreted as representing two 
underlying composite constructs represented by the data. 
Standardized discriminant function coefficients and canonical 
variate structure coefficients for this analysis are presented in 
Table 4. Consulting the two sets of function coefficients, one 
would conclude that SCORE3 weights most heavily on the first 
function, that SCORE1 weights heavily on the second function, and
that the near-zero weights associated with SCORE2 indicate that 
it does not contribute substantially to either function. 



INSERT TABLE 4 ABOUT HERE

However, consulting the structure coefficients (which are
not affected by collinearity among the variables), the
conclusions are somewhat different, suggesting that the second
synthetic variable is characterized by both the SCORE1 and SCORE2
variables. Hence, although both analyses serve to identify two
distinct constructs underlying the outcome variables, the nature
of the second construct is interpreted differently with the two
types of coefficients. By consulting only the function
coefficients in this example, the researcher may have been prone
to eliminate the SCORE2 variable from future research upon the
erroneous conclusion that it does not contribute much to either 
underlying construct. 

In order to investigate further the difference in 
interpreting results using function and structure coefficients, 
two additional discriminant analyses were run, each adding an 
additional achievement score variable to the original outcome 
variable set. Scores for these two additional variables (SCORE4
and SCORE5) are presented in the last two columns of Table 1.
The resultant function and structure coefficients for these two 
additional discriminant analyses are presented in Table 5. 



INSERT TABLE 5 ABOUT HERE

Consulting the function coefficients for the first of these
two analyses (Analysis #2), one might conclude that the first
function represents a variable construct characterized by SCORE3
and SCORE4, and that the second construct primarily represents
SCORE1. Again, as in the previous analysis, one might be prone
to feel that SCORE2 is an insignificant variable in the study as
it does not seem to make a notable contribution to either of the
identified functions.

The structure coefficients for this analysis yield a
considerably different interpretation, with SCORE3 and SCORE4
correlating highly with Function I, and with SCORE1, SCORE2,
and SCORE4 correlating highly with Function II. As noted in the
previous analysis, the effects of collinearity on the outcome 
variables could lead to a distorted understanding of the 
statistical results, and the possible exclusion of an important 
variable (SCORE2) from further study.

The final analysis (ANALYSIS #3) also yielded interesting
results. Utilizing the function coefficients, one would identify
two underlying constructs, one characterized by SCORE2, SCORE3,
SCORE4, and SCORE5, and the other characterized only by SCORE1.
It is particularly interesting that with the addition of SCORE5
to the outcome variable set, SCORE2, which had previously
weighted very minimally on either of the functions, now appears
to be very strongly identified with Function I. Hence, as
previously noted, the addition of a single variable can
sometimes have notable effects on the magnitude of the resultant
discriminant function coefficients (Huberty & Morris, 1989).
Utilizing the structure coefficients, one would associate the
SCORE3 and SCORE5 variables with Function I, and the remaining
three variables with Function II.

Discussion

The present study sought to investigate three methods for 
assessing the nature of constructs underlying synthetic variables 
identified in multivariate analyses of variance. Use of multiple 
ANOVAs following the initial statistically significant MANOVA 
identified one variable as contributing significantly to the 
multivariate results. The two multivariate approaches
(interpretation of discriminant function coefficients and
interpretation of the resultant canonical variate structure 
coefficients) indicated that other variables were also worthy of 
consideration, and suggested the validity of criticisms regarding 
the appropriateness of the use of univariate ANOVAs in the 
interpretation of multivariate results. 

Although there were some similarities in the interpretation 
of underlying constructs using function versus structure 
coefficients, there were also some striking differences. First 
of all, although the structure coefficient method of 
interpretation indicated the appropriateness of considering the 
SCORE2 variable in all three of the analyses, this variable did
not obtain a notable weight until the third analysis using the
function coefficient method. Since function coefficients tend to
be affected by collinearity, it is likely that this variable
failed to obtain a notable discriminant function weight in the 
prior two analyses due to a "suppressor effect" by one of the 
other outcome variables. Secondly, the new variables introduced 
in the second and third analyses tended to obtain their higher
function weights on the first discriminant function, yet these
variables were more equally distributed across the two functions 
as judged by their structure coefficients. Thirdly, although 
both sets of coefficients shifted with each analysis, in general 
the structure coefficients remained more stable. Finally, the 
two interpretive methods tended to yield different results as the 
n of outcome variables was increased. Hence, it may be possible 
that collinearity became a larger issue as more variables were 
added to the analysis. 

Table 2
MANOVA Results

Test Name     Value     Approx. F   Hypoth. DF   Error DF   Sig. of F
Pillais       .51900    3.73803     6.00         64.00      .003
Hotellings    .86115    4.30577     6.00         60.00      .001
Wilks         .51762    4.02929     6.00         62.00      .002
Roys          .43476

Table 3
Results of Subsequent ANOVAs
(DF = 2,33)

Variable   Hypoth. SS   Error SS    Hypoth. MS   Error MS   F          Sig. of F   Effect Size
SCORE1      6.05556      61.83333   3.02778      1.87374     1.61590   .214         8.92%
SCORE2      4.66667     112.08333   2.33333      3.39646      .68699   .510         4.00%
SCORE3     19.50000      27.25000   9.75000       .82576    11.80734   .000        41.71%



Table 4
Function and Structure Coefficients for Initial Discriminant
Analysis

            Function Coefficients      Structure Coefficients
Variable    Funct. I    Funct. II      Funct. I    Funct. II
SCORE1       .27229      1.04842        .10023       .99026
SCORE2       .02083      -.12414        .15238       .50838
SCORE3      1.01018      -.08978        .95977      -.27741

Table 5
Function and Structure Coefficients for Subsequent Discriminant
Analyses

ANALYSIS #2
            Function Coefficients      Structure Coefficients
Variable    Funct. I    Funct. II      Funct. I    Funct. II
SCORE1       .46296      1.21364        .06330       .98365
SCORE2       .02829      -.12103        .10938       .51929
SCORE3       .81054       .04618        .73479      -.13563
SCORE4       .92375      -.20919        .46618       .59609

ANALYSIS #3
            Function Coefficients      Structure Coefficients
Variable    Funct. I    Funct. II      Funct. I    Funct. II
SCORE1       .27050      1.22378        .05339       .98247
SCORE2       .68139      -.14606        .09126       .51759
SCORE3       .70524       .03585        .60996      -.14597
SCORE4       .52673      -.23057        .38755       .58933
SCORE5       .94256       .03299        .46928       .43601